Statistical model selection with “Big Data”
نویسندگان
چکیده
منابع مشابه
Statistical model selection with “Big Data”
Big Data offer potential benefits for statistical modelling, but confront problems like an excess of false positives, mistaking correlations for causes, ignoring sampling biases, and selecting by inappropriate methods. We consider the many important requirements when searching for a data-based relationship using Big Data, and the possible role of Autometrics in that context. Paramount considera...
متن کاملA Fuzzy TOPSIS Approach for Big Data Analytics Platform Selection
Big data sizes are constantly increasing. Big data analytics is where advanced analytic techniques are applied on big data sets. Analytics based on large data samples reveals and leverages business change. The popularity of big data analytics platforms, which are often available as open-source, has not remained unnoticed by big companies. Google uses MapReduce for PageRank and inverted indexes....
متن کاملStatistical estimation with model selection
The purpose of this paper is to explain the interest and importance of (approximate) models and model selection in Statistics. Starting from the very elementary example of histograms we present a general notion of finite dimensional model for statistical estimation and we explain what type of risk bounds can be expected from the use of one such model. We then give the performance of suitable mo...
متن کاملA big data approach to acoustic model training corpus selection
Deep neural networks (DNNs) have recently become the state of the art technology in speech recognition systems. In this paper we propose a new approach to constructing large high quality unsupervised sets to train DNN models for large vocabulary speech recognition. The core of our technique consists of two steps. We first redecode speech logged by our production recognizer with a very accurate ...
متن کاملStatistical analysis of big data on pharmacogenomics.
This paper discusses statistical methods for estimating complex correlation structure from large pharmacogenomic datasets. We selectively review several prominent statistical methods for estimating large covariance matrix for understanding correlation structure, inverse covariance matrix for network modeling, large-scale simultaneous tests for selecting significantly differently expressed genes...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: Cogent Economics & Finance
سال: 2015
ISSN: 2332-2039
DOI: 10.1080/23322039.2015.1045216